Cognitive Development


MIMo grows! Simulating body and sensory development in a multimodal infant model

López, Francisco M., Lenz, Miles, Fedozzi, Marco G., Aubret, Arthur, Triesch, Jochen

arXiv.org Artificial Intelligence

Infancy is characterized by rapid body growth and explosive changes in sensory and motor abilities. However, developmental robots and simulation platforms are typically designed in the image of a specific age, which limits their ability to capture the changing abilities and constraints of developing infants. To address this issue, we present MIMo v2, a new version of the multimodal infant model. It includes a growing body with increasing actuation strength covering the age range from birth to 24 months. It also features foveated vision with developing visual acuity, as well as sensorimotor delays modeling the finite signal transmission speeds to and from an infant's brain. Further enhancements in this MIMo version include an inverse kinematics module, a random environment generator, and updated compatibility with third-party simulation and learning libraries. Overall, this new MIMo version permits increased realism when modeling various aspects of sensorimotor development. The code is available on the official repository (https://github.com/trieschlab/MIMo).


Examining the effects of music on cognitive skills of children in early childhood with the Pythagorean fuzzy set approach

Kirisci, Murat, Topac, Nihat, Bardak, Musa

arXiv.org Artificial Intelligence

There are many genetic and environmental factors that affect cognitive development. Music education can be considered one of the environmental factors. Some researchers emphasize that music, like mathematics and chess, requires meta-cognitive functions and supports spatial intelligence. The effect of music on cognitive development in early childhood was examined using the Pythagorean Fuzzy Sets (PFS) method defined by Yager. In this study, PFS were constructed from experts' opinions, and a ranking algorithm based on these PFS was given. The algorithm's results supported the experts' data on the development of spatial-temporal skills through music education given in early childhood. The algorithm's ranking was computed using the Expectation Score Function, and the rankings obtained from the algorithm overlap with the experts' rankings.
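The abstract does not give the scoring formula, but for a Pythagorean fuzzy number with membership degree mu and non-membership degree nu (satisfying Yager's condition mu^2 + nu^2 <= 1), a commonly used expectation score function is s = (1 + mu^2 - nu^2) / 2, which maps each fuzzy number to [0, 1] for ranking. The sketch below illustrates ranking with this function; the alternative names and degree values are hypothetical, not taken from the study.

```python
def expectation_score(mu: float, nu: float) -> float:
    """Expectation score of a Pythagorean fuzzy number (mu, nu).

    mu is the membership degree and nu the non-membership degree;
    the pair is valid only when mu**2 + nu**2 <= 1 (Yager's condition).
    The score lies in [0, 1]; higher means a better alternative.
    """
    if mu ** 2 + nu ** 2 > 1:
        raise ValueError("not a valid Pythagorean fuzzy number")
    return (1 + mu ** 2 - nu ** 2) / 2


# Illustrative alternatives (hypothetical membership/non-membership degrees),
# ranked by descending expectation score:
alternatives = {
    "spatial-temporal": (0.9, 0.3),
    "verbal": (0.6, 0.5),
    "memory": (0.7, 0.4),
}
ranking = sorted(
    alternatives,
    key=lambda name: expectation_score(*alternatives[name]),
    reverse=True,
)
# ranking == ["spatial-temporal", "memory", "verbal"]
```

Aggregating several experts' opinions would additionally require a Pythagorean fuzzy aggregation operator before scoring; the expectation score is only the final ranking step.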


World Models in Artificial Intelligence: Sensing, Learning, and Reasoning Like a Child

Del Ser, Javier, Lobo, Jesus L., Müller, Heimo, Holzinger, Andreas

arXiv.org Artificial Intelligence

World Models help Artificial Intelligence (AI) predict outcomes, reason about its environment, and guide decision-making. While widely used in reinforcement learning, they lack the structured, adaptive representations that even young children intuitively develop. Advancing beyond pattern recognition requires dynamic, interpretable frameworks inspired by Piaget's cognitive development theory. We highlight six key research areas -- physics-informed learning, neurosymbolic learning, continual learning, causal inference, human-in-the-loop AI, and responsible AI -- as essential for enabling true reasoning in AI. By integrating statistical learning with advances in these areas, AI can evolve from pattern recognition to genuine understanding, adaptation and reasoning capabilities.


The Philosophical Foundations of Growing AI Like A Child

Luo, Dezhi, Li, Yijiang, Deng, Hokin

arXiv.org Artificial Intelligence

Despite excelling in high-level reasoning, current language models lack robustness in real-world scenarios and perform poorly on fundamental problem-solving tasks that are intuitive to humans. This paper argues that both challenges stem from a core discrepancy between human and machine cognitive development. While both systems rely on increasing representational power, the absence of core knowledge (the foundational cognitive structures present in humans) prevents language models from developing robust, generalizable abilities in which complex skills are grounded in simpler ones within their respective domains. The paper explores empirical evidence of core knowledge in humans, analyzes why language models fail to acquire it, and argues that this limitation is not an inherent architectural constraint. Finally, it outlines a workable proposal for systematically integrating core knowledge into future multi-modal language models through the large-scale generation of synthetic training data using a cognitive prototyping strategy.


CogDevelop2K: Reversed Cognitive Development in Multimodal Large Language Models

Li, Yijiang, Gao, Qingying, Sun, Haoran, Lyu, Haiyun, Luo, Dezhi, Deng, Hokin

arXiv.org Artificial Intelligence

Are Multi-modal Large Language Models (MLLMs) stochastic parrots? Do they genuinely understand? This paper explores, in MLLMs, the core cognitive abilities that human intelligence builds upon to perceive, comprehend, and reason. To this end, we propose CogDevelop2K, a comprehensive benchmark that spans 12 sub-concepts, from primitive knowledge like object permanence and boundary to more complex abilities like intentionality understanding, structured via the developmental trajectory of a human mind. We evaluate 46 MLLMs on our benchmarks. Surprisingly, we observe a reversed cognitive developmental trajectory compared to humans. We further evaluate the influence of evaluation strategies and prompting techniques. Website: https://growing-ai-like-a-child.github.io/


CogLM: Tracking Cognitive Development of Large Language Models

Wang, Xinglin, Yuan, Peiwen, Feng, Shaoxiong, Li, Yiwei, Pan, Boyuan, Wang, Heda, Hu, Yao, Li, Kan

arXiv.org Artificial Intelligence

Piaget's Theory of Cognitive Development (PTC) posits that the development of cognitive levels forms the foundation for human learning across various abilities. As Large Language Models (LLMs) have recently shown remarkable abilities across a wide variety of tasks, we are curious about the cognitive levels of current LLMs: to what extent they have developed and how this development has been achieved. To this end, we construct a benchmark CogLM (Cognitive Ability Evaluation for Language Model) based on PTC to assess the cognitive levels of LLMs. CogLM comprises 1,220 questions spanning 10 cognitive abilities crafted by more than 20 human experts, providing a comprehensive testbed for the cognitive levels of LLMs. Through extensive experiments across multiple mainstream LLMs with CogLM, we find that: (1) Human-like cognitive abilities have emerged in advanced LLMs (GPT-4), comparable to those of a 20-year-old human. (2) The parameter size and optimization objective are two key factors affecting the cognitive levels of LLMs. (3) The performance on downstream tasks is positively correlated with the level of cognitive abilities. These findings fill the gap in research on the cognitive abilities of LLMs, tracing the development of LLMs from a cognitive perspective and guiding the future direction of their evolution.


MIMo: A Multi-Modal Infant Model for Studying Cognitive Development

Mattern, Dominik, Schumacher, Pierre, López, Francisco M., Raabe, Marcel C., Ernst, Markus R., Aubret, Arthur, Triesch, Jochen

arXiv.org Artificial Intelligence

Human intelligence and human consciousness emerge gradually during the process of cognitive development. Understanding this development is an essential aspect of understanding the human mind and may facilitate the construction of artificial minds with similar properties. Importantly, human cognitive development relies on embodied interactions with the physical and social environment, which is perceived via complementary sensory modalities. These interactions allow the developing mind to probe the causal structure of the world. This is in stark contrast to common machine learning approaches, e.g., for large language models, which are merely passively "digesting" large amounts of training data, but are not in control of their sensory inputs. However, computational modeling of the kind of self-determined embodied interactions that lead to human intelligence and consciousness is a formidable challenge. Here we present MIMo, an open-source multi-modal infant model for studying early cognitive development through computer simulations. MIMo's body is modeled after an 18-month-old child with detailed five-fingered hands. MIMo perceives its surroundings via binocular vision, a vestibular system, proprioception, and touch perception through a full-body virtual skin, while two different actuation models allow control of his body. We describe the design and interfaces of MIMo and provide examples illustrating its use. All code is available at https://github.com/trieschlab/MIMo .


On the Computational Modeling of Meaning: Embodied Cognition Intertwined with Emotion

Kennington, Casey

arXiv.org Artificial Intelligence

How can machines understand language? This question has been asked by many and represents an important facet of artificial intelligence. Large language models like ChatGPT seem to understand language, but as has been pointed out (Bender and Koller, 2020; Bisk et al., 2020), even large, powerful language models trained on huge amounts of data are likely missing key information that would allow them to reach the depth of understanding that humans have. What information are they missing, and, perhaps more importantly, what information do they have that enables them to understand to the degree that they do? Current computational models of semantic meaning can be broken down into three paradigms: distributional paradigms, where meaning is derived from how words are used in text (i.e., the notion that the meaning of a word depends on the "company it keeps," following Firth (1957)); grounded paradigms, where aspects of the physical world are linked to language (i.e., the symbol grounding problem following Harnad (1990)); and formal paradigms, where meaning is a logical form and the meaningfulness of language lies in the fact that it is about the world (Dahlgren, 1976) (e.g., first-order logic as in L.T.F.


Why diversity and inclusion needs to be at the forefront of future AI

Robohub

Inês Hipólito is a highly accomplished researcher, recognized for her work in esteemed journals and her contributions as a co-editor. She has received research awards including the prestigious Talent Grant from the University of Amsterdam in 2021. After her PhD, she held positions at the Berlin School of Mind and Brain and at Humboldt-Universität zu Berlin. Currently, she is a permanent lecturer in the philosophy of AI at Macquarie University, focusing on cognitive development and the interplay between augmented cognition (AI) and the sociocultural environment. She is also involved in the project 'Neurourbanism as a Novel Approach in Global Health,' funded by the Berlin University Alliance.


Everything you need to know about BabyAGI - TechStory

#artificialintelligence

In recent months, we have seen the emergence and proliferation of several artificial intelligence systems worldwide, such as OpenAI's ChatGPT, GPT-4, and Google's Bard. Microsoft's new Bing and Baidu's Ernie Bot have also entered the scene. Joining this group of AI systems is a newcomer known as BabyAGI. BabyAGI is an innovative AI platform designed to train and evaluate various AI agents in a simulated environment. It is a pared-down version of the original Task-Driven Autonomous Agent, developed and launched by VC and AI expert Yohei Nakajima.